Abstract

We show that the tracking system in a collider detector can be used to efficiently distinguish boosted
massive particles from their QCD backgrounds. We examine variables defined with tracking
information which are sensitive to jet radiation patterns, including charged particle multiplicity and
N-subjettiness. These variables are barely correlated with variables sensitive to the hard splitting
scale in the jet, such as the filtered jet mass. Therefore these two kinds of variables should be
combined to optimize the discriminating power. We illustrate the method with W jet tagging.
It is shown that for jet pT = 500 GeV, one can gain a factor of 1.6 in statistical significance by
combining filtered jet mass and charged particle multiplicity, over filtered mass alone. Adding
N-subjettiness increases the factor to 1.8.




1. INTRODUCTION 



Highly boosted massive standard model (SM) particles are important probes of TeV
scale new physics at the Large Hadron Collider (LHC). These particles include the W and
Z gauge bosons, the top quark, and possibly the Higgs boson if it is light. For example, it
is essential to measure how the WW scattering cross section grows with increasing center
of mass energy, in which case boosted W bosons are involved [2]. When these particles
decay hadronically, we have to distinguish them from their QCD backgrounds. See Refs. [29]
for previous studies. For this purpose, a large boost brings us both advantages and
disadvantages. On the one hand, since the decay products are collimated, they are often
clustered into a single jet and we are spared the combinatorial problem associated
with unboosted particles. Sometimes it is even convenient to use a large jet radius to group
as many such particles as possible into single fat jets [3]. We will call these jets W/Z/top jets
and in general, boosted massive particle jets or simply boosted jets. On the other hand, since
they behave as a single jet, we need to distinguish them from high-pT QCD jets, namely,
jets initiated from a high-pT quark or gluon. Because of QCD radiation, the jet mass alone
is not a good discriminant, especially when we choose a large jet radius.

To distinguish massive particle jets from QCD jets, we can utilize two differences between 
them. First, compared with QCD splittings, which are usually hierarchical, the momentum 
of a boosted massive particle is more evenly distributed among its decay products. This 
results in more than one hard subjet within the fat jet for a massive particle, while only one 
hard subjet for most QCD jets. Several algorithms [3, 5, 30] have been invented to identify
subjets and "groom" the fat jet by discarding the soft subjets. After grooming, the mass of
a boosted jet remains close to the decaying particle's mass, while that of a QCD jet usually 
becomes very small. This allows us to use a mass window cut to eliminate most QCD jets 
and retain the massive particles we are interested in. To get the characteristic mass of a 
particular particle, it is important we measure the momenta of all stable particles, which 
is possible only with calorimeters. Therefore, in most previous jet substructure studies, 
hadronic calorimeter (HCAL) granularity is assumed. 

The second difference between boosted jets and QCD jets stems from their different color 
structures: QCD jets are initiated from a colored particle, while W/Z/Higgs bosons are 
color singlets. The color flow of the top decay is also different from a generic QCD jet 
with three hard subjets. This difference results in different radiation patterns, which are
manifest in jet shape variables such as planar flow [4], pull [19], N-subjettiness [24] and
R-cores [25]. Unlike for the jet mass, we do not need information on all stable particles
to define and examine these variables. In particular, we can take advantage of the
tracking system, which has much finer granularity as well as better momentum resolution
for low to moderately high momenta. The obvious disadvantage of using tracking
information is that it is only available for charged particles, whose fraction in a jet fluctuates.
Therefore, it is complementary to calorimeter information.

In this article, taking W jets as an example, we discuss how to use tracking information 
to distinguish boosted massive particle jets from QCD jets. We explore variables sensitive 
to jets' radiation patterns. The simplest such variable is charged particle multiplicity, which 
nonetheless shows excellent discriminating power. All previously defined jet shape variables 
can also be calculated with charged particles alone. In particular, we will see N-subjettiness 
is very useful. These variables are not very sensitive to the mass scale of the particle decay 
or QCD splitting, and they should be combined with a jet grooming algorithm to optimize 
the discriminating power. Therefore, we examine these variables for jets containing two 
hard subjets (identified with the filtering algorithm) with masses close to the W mass. The 
performances of these variables are conveniently quantified by the significance improvement
characteristic (SIC), which is defined as [31]

    \mathrm{SIC} = \epsilon_S / \sqrt{\epsilon_B},    (1)

where ε_S and ε_B are respectively the signal (W jets) and background (QCD jets) efficiencies.
It is shown that for the LHC, by combining charged multiplicity (or N-subjettiness) with
filtering, we improve the SIC by a factor of ~ 1.6 over the filtering method alone. This approach also
gives better discriminating power than methods that combine correlated variables, such as
in Ref. [10], where different jet grooming algorithms are combined. Combining filtered mass,
charged multiplicity and N-subjettiness all together, we can reach an improvement of 1.8.
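
To make the definition in Eq. (1) concrete, the following is a minimal Python sketch. If a selection keeps a fraction ε_S of the signal and ε_B of the background, a significance estimated as S/√B is rescaled by exactly ε_S/√ε_B, which is what the function returns. The function names and the event counts are hypothetical and chosen only for illustration; they are not taken from the analysis.

```python
import math

def sic(eff_signal, eff_background):
    """Significance improvement characteristic, eps_S / sqrt(eps_B)."""
    if eff_background <= 0.0:
        raise ValueError("background efficiency must be positive")
    return eff_signal / math.sqrt(eff_background)

def efficiencies(n_sig_pass, n_sig_total, n_bkg_pass, n_bkg_total):
    """Convert pass/total event counts into (eps_S, eps_B)."""
    return n_sig_pass / n_sig_total, n_bkg_pass / n_bkg_total

# Hypothetical counts (illustration only): 60% of signal and 9% of
# background survive the selection.
eps_s, eps_b = efficiencies(600, 1000, 90, 1000)
print(f"eps_S = {eps_s:.2f}, eps_B = {eps_b:.2f}, SIC = {sic(eps_s, eps_b):.2f}")
```

An SIC above 1 means the selection improves S/√B; a selection that keeps everything has SIC = 1.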

The article is organized as follows. In Sec. 2, we illustrate the difference between W jets
and QCD jets at e+e- machines, which provide a clean environment without contamination
from initial state radiation and the underlying event. In Sec. 3, we examine jet substructure
variables at the LHC and quantify their performances. We discuss our results in Sec. 4 and
conclude in Sec. 5.

2. W JETS AND QCD JETS AT e+e- MACHINES

We first compare W jets with QCD jets produced at e+e- machines, where jet substructure
is not contaminated by initial state radiation and the underlying event. The lessons
learned in this section can be easily adapted for the LHC, which we discuss in the next
section.



A. Charged particle multiplicity 

Charged particle multiplicity is a simple but powerful variable that is sensitive to the jet
color structure. For example, it can be used to distinguish gluon jets from quark jets [32].
Average charged particle multiplicities (⟨Nch⟩) in inclusive hadronic events have been
measured at a number of e+e- machines. The results are compiled in Ref. [33] and reproduced in
Fig. 1. These measurements cover a variety of center of mass energies, from 12 GeV at
PETRA up to LEP 2 energies. Theoretically, the absolute value of ⟨Nch⟩ cannot be predicted
because it involves non-perturbative physics. However, its scaling can be described by the
modified leading log approximation (MLLA) and local parton-hadron duality (LPHD) [34];
see Ref. [36] for a review. The prediction from MLLA+LPHD is shown in Fig. 1 together
with the prediction from Pythia 8 [41] simulations. We see that the MLLA+LPHD fit and
the Pythia 8 prediction are almost identical, and both agree very well with the data.

In Fig. 1, we also notice that charged particle multiplicity grows slowly, especially at high
energies. This is an example of the fact that the radiation pattern is more sensitive to the
color structure than to the energy scale. This effect has important consequences, as will be seen
in our study of jet substructure.
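
The slow growth can be illustrated numerically. The sketch below assumes the commonly quoted MLLA form ⟨Nch⟩ ∝ αs^b exp(c/√αs), with b ≈ 0.49 and c ≈ 2.27 for five flavours, and a one-loop running coupling anchored at αs(MZ) = 0.118. These inputs are assumptions of this example and may differ from the fit actually used for Fig. 1. The overall normalization is non-perturbative, so only ratios between energies are meaningful here.

```python
import math

N_F = 5                          # active quark flavours (assumed)
B_EXP, C_EXP = 0.49, 2.27        # MLLA exponents usually quoted for N_F = 5
ALPHA_S_MZ, M_Z = 0.118, 91.19   # reference coupling and scale in GeV (assumed)

def alpha_s(sqrt_s):
    """One-loop running coupling, anchored at alpha_s(M_Z)."""
    b0 = (33.0 - 2.0 * N_F) / (12.0 * math.pi)
    return ALPHA_S_MZ / (1.0 + b0 * ALPHA_S_MZ * math.log(sqrt_s**2 / M_Z**2))

def mlla_shape(sqrt_s):
    """Energy dependence of <Nch>, up to an unknown overall normalization."""
    a = alpha_s(sqrt_s)
    return a**B_EXP * math.exp(C_EXP / math.sqrt(a))

def growth(sqrt_s_low, sqrt_s_high):
    """Ratio of the predicted <Nch> at two center of mass energies."""
    return mlla_shape(sqrt_s_high) / mlla_shape(sqrt_s_low)

print(f"<Nch>(200 GeV) / <Nch>(M_Z) = {growth(M_Z, 200.0):.2f}")
```

In this approximation, more than doubling the center of mass energy increases the multiplicity by well under a factor of two.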

For the W boson, a color singlet particle, the average charged multiplicity should be the
same as in Fig. 1 at √s = MW, which was confirmed at LEP 2 [35]. Not shown in Fig. 1 is
the dispersion of the charged multiplicity (defined as ⟨(Nch − ⟨Nch⟩)²⟩^(1/2)), which is ~ 6 for
the W boson. Except for experimental effects, the charged particle multiplicity distribution
is invariant under a boost of the W boson, which allows us to use it to identify W jets.

FIG. 1: Average charged particle multiplicities ⟨Nch⟩ as a function of center of mass energy measured
at e+e- machines (points) [33], together with the MLLA+LPHD fit (blue solid) and Pythia 8
simulations (red dashed).

The inclusive charged multiplicity is not directly applicable for identifying W jets since
jets are defined in a finite spatial region. We therefore examine charged multiplicities for individual
jets using Pythia 8 simulations. As mentioned in the introduction, a QCD jet without a
hard splitting can be easily distinguished from a W jet by using the jet grooming algorithms;
therefore, we focus on QCD jets with a hard splitting (2-prong QCD jets), which mimic a
W jet more closely. Nonetheless, we will also include 1-prong QCD jets for comparison.
For illustration, we consider e+e- → W+W- → qq̄lν events and e+e- → qq̄g events in the
following fixed configurations (Fig. 2): the hadronically decaying W moves along the x axis
and decays to two quarks with symmetric 4-momenta,

    p_{q_1} = \left( \sqrt{p_T^2 + M_W^2}/2, \; p_T/2, \; 0, \; M_W/2 \right), \quad
    p_{q_2} = \left( \sqrt{p_T^2 + M_W^2}/2, \; p_T/2, \; 0, \; -M_W/2 \right),    (2)

where pT is fixed to be 500 GeV. For e+e- → qq̄g, we choose the quark and the gluon to
mimic the partons from a W decay; therefore, the 4-momenta are

    p_q = \left( \sqrt{p_T^2 + M_W^2}/2, \; p_T/2, \; 0, \; M_W/2 \right), \quad
    p_g = \left( \sqrt{p_T^2 + M_W^2}/2, \; p_T/2, \; 0, \; -M_W/2 \right),
    p_{\bar q} = \left( p_T, \; -p_T, \; 0, \; 0 \right).    (3)

There is another configuration with the above momenta in which the two quarks are close
to each other and the gluon is in the opposite hemisphere. This configuration happens much
less often than the one shown in Fig. 2 and we have ignored it in our illustration. In a
realistic situation at the LHC, as will be discussed in Sec. 3, we should include all possible
configurations contributing to a high-pT jet. This is more conveniently done with Pythia 8
or other simulations.

FIG. 2: Fixed momentum configurations for illustration. Left: WW → qq̄lν with the hadronic
W moving perpendicularly to the beam and decaying to quarks with symmetric momenta. Right:
e+e- → qq̄g with a qg pair mimicking a W boson.

We keep the momentum and color flow configurations fixed as in Fig. 2 and repeatedly
use Pythia 8 to simulate showering and hadronization, obtaining two data samples
corresponding to the two processes. We cluster stable particles into anti-kT jets (R = 1.2) with
FastJet [42]. Each WW event then contains a W jet in the upper hemisphere, and each
qq̄g event contains a 2-prong jet in the upper hemisphere and a 1-prong jet in the lower
hemisphere. No cut is used in this procedure. The charged particle multiplicities of
the W jet and of the 2-prong and 1-prong QCD jets are shown in Fig. 3. From Fig. 3, we see that
the average charged multiplicity of a W jet is larger than that of a 1-prong QCD jet but
smaller than that of a 2-prong QCD jet. This is due to their different color structures. In particular,
the 2-prong QCD jet is color connected to the other side of the event; therefore it contains
more radiation than the W jet. To distinguish a W jet from a 2-prong QCD jet, we can
apply a cut Nch < Nch*. For example, when Nch* = 19, we keep 63% of W jets and 7.7% of
2-prong QCD jets, which improves the significance by a factor of 0.63/√0.077 ≈ 2.3. Because of
the large boost and the large jet radius, R = 1.2, almost all particles from the W decay are included in the
W jet. Due to charge conservation, the W jet (almost) always contains an odd number of
charged particles. If we keep only jets with an odd number of charged particles, we obtain a
larger SIC. However, this feature is easily lost due to experimental acceptance, and we do
not pursue this possibility further.

FIG. 3: Charged particle multiplicities for W jets, 2-prong and 1-prong QCD jets with pT =
500 GeV, in the fixed momentum configurations of Fig. 2 (see text).



B. N-subjettiness 

Other existing jet shape variables can be defined with charged particles too. Here, we
take N-subjettiness as an example, which is derived from N-jettiness [43] and defined in
Ref. [24] as follows. For a set of particles {i} in a jet with radius R0 and a set of N axes
{J}, we define the distance ΔR_{J,i} = √(Δη_{J,i}² + Δφ_{J,i}²) for each (J, i) pair. Then we define a
quantity

    \tilde{\tau}_N^{(\beta)} = \frac{1}{d_0} \sum_i p_{T,i} \, \min\{ \Delta R_{1,i}^{\beta}, \ldots, \Delta R_{J,i}^{\beta}, \ldots, \Delta R_{N,i}^{\beta} \},    (4)

where d_0 = \sum_i p_{T,i} R_0^{\beta} and β is a pre-selected constant. We vary the directions of the N
axes to find the minimum of τ̃_N^(β), which is defined as N-subjettiness, τ_N^(β).
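
The quantity in Eq. (4) for a given set of axes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the paper: particles are represented as hypothetical (pT, η, φ) tuples, axes as (η, φ) pairs, and the minimization over axis directions that turns τ̃_N into τ_N is deliberately left out.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the (eta, phi) plane, with phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def tau_n_fixed_axes(particles, axes, r0=1.2, beta=1.0):
    """Evaluate Eq. (4) for fixed axes (no minimization over axis directions).

    particles: list of (pt, eta, phi); axes: list of (eta, phi).
    """
    d0 = sum(pt for pt, _, _ in particles) * r0**beta
    if d0 <= 0.0:
        raise ValueError("need at least one particle with positive pT")
    total = 0.0
    for pt, eta, phi in particles:
        # Each particle contributes through its distance to the closest axis.
        total += pt * min(delta_r(eta, phi, a_eta, a_phi) ** beta
                          for a_eta, a_phi in axes)
    return total / d0

# Toy jet (illustration only): two equal-pT particles at eta = +0.2 and -0.2.
toy = [(100.0, 0.2, 0.0), (100.0, -0.2, 0.0)]
print(tau_n_fixed_axes(toy, [(0.0, 0.0)]))               # one axis in the middle
print(tau_n_fixed_axes(toy, [(0.2, 0.0), (-0.2, 0.0)]))  # one axis per particle
```

With one axis per particle the toy jet gives zero, while a single central axis gives a positive value, which is the sense in which a small τ2 relative to τ1 indicates two-prong structure.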

In our example with fixed momentum configuration, we can simply use the momenta
in Eq. (2) as the two axes to calculate τ̃2^(β), which does not differ significantly from the
true τ2^(β) after minimization. The τ2^(β) distributions for W jets and 2-prong QCD jets are
shown in Fig. 4, where we have chosen β = 1. The variable τ2 quantifies how likely a given
jet is to contain two hard subjets, a smaller value corresponding to a larger likelihood. In our
special momentum configuration, we always have two hard subjets. In this case, τ2 becomes
a measure of how diffuse the radiation is. As expected, τ2 is more often larger for 2-prong
QCD jets than for W jets, which allows us to apply a cut, τ2 < τ2*, to suppress background
QCD jets. In the general case, when the jet momentum configuration is not fixed, a better
variable is τ2/τ1 [24], which will be discussed in the next section for the LHC.

FIG. 4: 2-subjettiness with axes fixed to Eq. (2), for jets with pT = 500 GeV in the fixed momentum
configurations of Fig. 2.



3. W JET TAGGING AT THE LHC 



We now turn to the LHC, where W jet tagging is much more challenging due to the presence
of initial state radiation and the underlying event. As an example, we consider W jets and
QCD jets in the pT range (500, 550) GeV. These jets are obtained as in Ref. [25]: we use
Pythia 8 to generate high-pT WW pairs which decay semileptonically, and W+jet events with
the W decaying leptonically. Each WW event then contains a W jet, while each W+jet
event contains a high-pT QCD jet. The visible stable particles in these events are grouped
in 0.1 × 0.1 bins in the (η, φ) plane, corresponding to the HCAL granularity. Jets are found
with the Cambridge/Aachen algorithm (R = 1.2). Since we are interested in jets with a hard
splitting and mass close to the W mass, we apply the filtering algorithm with the mass drop
method and further examine jets passing the filtered jet mass (mfilt) cut, (60, 100) GeV. The
mass drop parameters are μ = 0.71 and ycut = 0.09. With this cut, we reduce the number of
QCD jets by 91% and at the same time retain 66% of W jets, which gives a factor of
0.66/√0.09 = 2.2 improvement in the SIC over the original fat jets. Events passing this cut all
have a hard splitting; therefore we can apply the lessons learned from the previous section.
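
For orientation, the mass drop condition can be sketched as a simple predicate. The sketch assumes the criterion as it is commonly stated for mass-drop tagging: when a jet of mass m_j is declustered into two subjets, the heavier subjet must satisfy m_j1 < μ m_j, and the splitting must not be too asymmetric, y = min(pT1², pT2²) ΔR12² / m_j² > ycut. The function and argument names are hypothetical, the numerical inputs are toy values, and the subsequent filtering step (reclustering with a smaller radius and keeping the hardest subjets) is not shown.

```python
def passes_mass_drop(m_jet, m_heavier_subjet, pt_sub1, pt_sub2, delta_r,
                     mu=0.71, y_cut=0.09):
    """Return True if a declustering step looks like a hard, symmetric splitting.

    Masses and transverse momenta in GeV; delta_r is the (eta, phi) distance
    between the two subjets. Defaults are the parameter values quoted in the text.
    """
    if m_jet <= 0.0:
        return False
    mass_drop = m_heavier_subjet < mu * m_jet
    y = min(pt_sub1, pt_sub2) ** 2 * delta_r ** 2 / m_jet ** 2
    return mass_drop and y > y_cut

# Toy inputs (illustration only).
print(passes_mass_drop(80.0, 10.0, 250.0, 250.0, 0.32))  # symmetric, W-like
print(passes_mass_drop(60.0, 15.0, 480.0, 20.0, 0.50))   # soft emission, QCD-like
```

The first toy splitting shares the momentum evenly and passes; the second is dominated by one subjet and fails the asymmetry requirement.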

We identify tracks with pT > 1 GeV and |η| < 2.5 in a given jet, and use them to
calculate the charged particle multiplicity and N-subjettiness. Note that although the jet
has passed the filtered mass window cut, we include all tracks in the original fat jet, which
gives us more information about the jet color structure. The most efficient N-subjettiness
variable for W-tagging is τ2/τ1, where we have set β = 1 in Eq. (4) and used the code in
Ref. [44]. In addition, the filtered jet mass distributions still differ for W jets and QCD
jets after imposing the (60, 100) GeV mass window cut. We therefore include the filtered
mass as well. The three variables under consideration are shown in Fig. 5. From Fig. 5, we
see that Nch and τ2/τ1 have features similar to those at an e+e- machine (Figs. 3 and 4), but the
distinctions between QCD jets and W jets are smaller.
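
The track selection described above amounts to a simple count. A minimal sketch, assuming each track in the original fat jet is available as a hypothetical (pT, η) pair:

```python
def charged_multiplicity(tracks, pt_min=1.0, abs_eta_max=2.5):
    """Count tracks with pT > pt_min (GeV) and |eta| < abs_eta_max.

    tracks: iterable of (pt, eta) pairs for the charged particles
    associated with the original (ungroomed) fat jet.
    """
    return sum(1 for pt, eta in tracks if pt > pt_min and abs(eta) < abs_eta_max)

# Toy track list (illustration only).
toy_tracks = [(0.5, 0.1), (1.0, 0.0), (2.0, -2.4), (5.0, 2.6), (30.0, 1.0)]
print(charged_multiplicity(toy_tracks))
```

Both thresholds are applied as strict inequalities, as in the text, so a track with exactly pT = 1 GeV is not counted.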

For a single variable, we can apply a rectangular cut to improve the SIC. For two or more
variables, it is better to combine them using a multivariate classifier such as the Boosted
Decision Trees (BDT) method [45] in the package TMVA [46]. The best performances for the three
individual variables as well as all combinations are given in Tab. 1.
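
As an illustration of the single-variable case, the sketch below scans upper cuts of the form x ≤ cut on one variable (for an integer variable such as Nch this is the same as x < cut + 1) and returns the cut with the largest SIC. The function name, the minimum signal efficiency guard and the toy samples are assumptions of this example; the multivariate BDT combination is not reproduced here.

```python
import math

def best_upper_cut(signal, background, min_signal_eff=0.1):
    """Scan cuts x <= cut and return (best_sic, best_cut).

    signal, background: lists of values of one variable for each sample.
    min_signal_eff guards against cuts that keep almost no signal, where
    the SIC estimate is dominated by statistical fluctuations.
    """
    best_sic, best_cut = 0.0, None
    for cut in sorted(set(signal) | set(background)):
        eps_s = sum(1 for x in signal if x <= cut) / len(signal)
        eps_b = sum(1 for x in background if x <= cut) / len(background)
        if eps_s < min_signal_eff or eps_b == 0.0:
            continue
        value = eps_s / math.sqrt(eps_b)
        if value > best_sic:
            best_sic, best_cut = value, cut
    return best_sic, best_cut

# Toy samples (illustration only): the signal tends to have smaller values.
toy_signal = [10, 12, 14, 15, 18, 20, 25, 30]
toy_background = [15, 20, 22, 25, 28, 30, 35, 40]
print(best_upper_cut(toy_signal, toy_background))
```

With realistic sample sizes one would also want an uncertainty on the optimum, since the maximum of a scanned quantity is biased upward.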

We emphasize here that the numbers in Tab. 1 are obtained on jet samples that have passed
the filtered mass cut, (60, 100) GeV, which has already increased the SIC by a factor of 2.2.
Therefore, the overall improvement is the number in Tab. 1 multiplied by 2.2. We see that the
number for using filtered mass alone is 1.15, which means (60, 100) GeV is not the optimum
mass window. The extra factor of 1.15 corresponds to a narrower mass window, (72, 92) GeV,
and the best improvement one can get from filtering alone is 2.2 × 1.15 ≈ 2.5. Better improvements are
obtained from Nch or τ2/τ1 (with the filtered mass window fixed to (60, 100) GeV): we obtain
1.34 (1.39) by optimizing the cut on Nch (τ2/τ1).

FIG. 5: Variables considered for the LHC: jet mass after filtering, charged multiplicity and τ2/τ1,
for jet pT ∈ (500, 550) GeV. The filtered mass is constructed assuming HCAL granularity; Nch and
τ2/τ1 are constructed using tracks with pT > 1 GeV and |η| < 2.5.

One may also be interested in the performance of a single variable sensitive to the radiation,
such as τ2/τ1, without imposing a filtered mass cut. It turns out that one can improve the
significance by a factor of 1.44 using τ2/τ1 calculated from charged particles^1 for jets with
R = 1.2 and pT = 500 GeV. This is lower than what we get from filtering. Nevertheless,
τ2/τ1 may be more useful if we do not know the exact mass of the boosted particle. Also, at
higher pT, the resolution of the mass measurement degrades, while we expect radiation variables
to work better. This is because at higher pT, the decay products of a color singlet
particle occupy a smaller region while the radiation pattern of a QCD jet does not change
significantly, which makes the two cases more distinguishable. A similar observation has been
made in Ref. [25]. In the extreme case, all decay products of a color singlet particle enter
a single or a few adjacent calorimeter cells, which makes the mass information unavailable,
and we are forced into an inclusive search for color singlet particles without using their mass
information.

^1 The performance depends on the value of β in Eq. (4). It turns out that with a filtered mass cut, β = 1
is a better choice than β = 2. Therefore we have used β = 1 throughout the paper. However, without a
filtered mass cut, β = 2 works better, which gives a larger SIC of 1.58 for jet pT = 500 GeV. A detailed
study of the β dependence is beyond the scope of this article.

             mfilt    Nch            τ2/τ1
    mfilt    1.15     1.66 (1.59)    1.67 (1.58)
    Nch               1.34           1.55 (1.50)
    τ2/τ1                            1.39
    all:     1.85

TABLE 1: Optimized improvement in the SIC. The events have passed an overall filtered mass
cut (60, 100) GeV. The diagonal elements of the first three rows are obtained by using individual
variables with an optimized rectangular cut. The off-diagonal elements are obtained by combining
a pair of variables: the numbers in the parentheses are obtained using rectangular cuts and the
numbers outside are from BDT. The best improvement for combining all three variables in BDT
is given in the last row.

             W jets                              QCD jets
             mfilt    Nch      τ2/τ1             mfilt    Nch      τ2/τ1
    mfilt    1        -0.08    -0.12    mfilt    1        0.07     -0.14
    Nch      -0.08    1        0.51     Nch      0.07     1        0.50
    τ2/τ1    -0.12    0.51     1        τ2/τ1    -0.14    0.50     1

TABLE 2: Linear correlation matrices of the variables. Left: W jets; right: QCD jets.

We can obtain better discriminating power if we combine two variables and vary the
cuts on both of them. From Table 1, we see that a factor of ~ 1.6 is reached if we optimize
rectangular cuts on both mfilt and Nch (τ2/τ1), or if we combine them in a BDT, the latter
being slightly better. We also notice that combining Nch and τ2/τ1 gives us a smaller improvement
(1.55) than combining one of them with the filtered mass, despite the fact that each alone is
an excellent discriminant. This is due to the larger correlation between Nch and τ2/τ1, both
of which measure the amount of radiation in the jets. On the other hand, the correlation
between Nch (τ2/τ1) and mfilt is small. These correlations are manifest in the two dimensional
distributions for each pair of variables, shown in Fig. 6. The linear correlation matrices of
the three variables are given in Tab. 2.
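
A linear correlation matrix of this kind can be computed with the standard Pearson coefficient. The sketch below is a plain-Python illustration with hypothetical function names and toy inputs; it is not the tool used to produce the numbers quoted in the text.

```python
import math

def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """columns: list of per-variable samples, e.g. [mfilt, nch, tau21]."""
    return [[pearson(a, b) for b in columns] for a in columns]

# Toy columns (illustration only), one entry per jet.
mfilt = [78.0, 81.0, 85.0, 90.0, 76.0]
nch = [18, 25, 21, 30, 16]
tau21 = [0.30, 0.55, 0.40, 0.70, 0.25]
for row in correlation_matrix([mfilt, nch, tau21]):
    print(["%.2f" % value for value in row])
```

The matrix is symmetric with unit diagonal by construction, so only the three off-diagonal entries carry information for three variables.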

From the above results, we draw the conclusion that if we are to use two variables for W
jet tagging, the best way is to choose one variable sensitive to the hard splitting scale and the
other one sensitive to the amount of radiation. These two kinds of variables characterize two
major differences between W jets and QCD jets and they are barely correlated. This is in
accordance with the observation we made in Sec. 2, where we saw that charged multiplicity
increases slowly with the center of mass energy. By combining the two kinds of variables, we
obtain a significant improvement over using either variable alone.

Finally, we can combine all three variables in a BDT. Despite the sizable correlation between
Nch and τ2/τ1, they still contain different information, which can improve the SIC when
combined. We show the SIC as a function of signal efficiency in Fig. 7. From Fig. 7, we
see that a factor of 1.85 is achieved using the optimum cut, with a sizable signal efficiency of
~ 0.3. We may also add other variables as in Ref. [25], and the SIC will gradually saturate.
In Ref. [25], a set of 25 variables is used, which yields a factor of 2.4. It turns out that by adding
the two extra variables, Nch and τ2/τ1, to the set (mfilt was included in Ref. [25]), we only obtain
a few percent improvement, which means most of the information is redundant. However,
we emphasize that if one would like to sacrifice performance for simplicity and choose to
use only a few variables, the variables we have considered in this article are among the best
ones.



4. DISCUSSIONS

A. Experimental considerations 

In the above discussions, we have not taken into account the experimental efficiency for
reconstructing tracks in a jet. In Ref. [37], jets with pT ~ 200 GeV are studied, where it
is shown that the efficiency for identifying tracks is around 90% with a fake rate of about 0.1%.
The performance of our tracking variables will degrade accordingly. One may be concerned
about whether the efficiency decreases significantly at higher pT. This deserves dedicated
studies using both simulations and real data. However, we believe this is not the case
for QCD jets. The reason is that the scaling of Nch at a hadron collider is similar to that at e+e-
machines [40], which we see in Fig. 1. As we have noticed, the number of tracks grows very slowly at
high energies. The angular distributions of these tracks will not change significantly either.
Therefore, the efficiency will not change significantly for QCD jets. The W jet is a different
case: when the boost is larger, all the tracks will be packed in a smaller region, which may
cause the efficiency to drop significantly. However, as we have used the fact that Nch is smaller
for W jets, a drop in efficiency will only make the number even smaller and will not hurt
the discriminating power. An observation of dense tracks (or hits if tracks are difficult to
reconstruct) in a small region combined with few tracks outside of that region constitutes
a very clean signal of a W jet.

FIG. 6: Two dimensional distributions for variable pairs. Left: W jets; right: QCD jets. The
number of events is normalized to 10k for each plot.

FIG. 7: The SIC as a function of signal efficiency, obtained by combining mfilt, Nch and τ2/τ1 in a BDT, for
jet pT ∈ (500, 550) GeV.

Another concern about using high momentum tracks is that the tracking resolution degrades
at higher pT. This does not affect the charged multiplicity measurement, but it affects
N-subjettiness. However, since the momentum of the jet is shared by tens of particles, each
charged particle does not usually have a very high pT. For the 500 GeV jets we considered
in Sec. 3, the leading track's pT is shown in Fig. 8, from which we see that the track's pT rarely
goes above 200 GeV, where the resolution of the momentum measurement is still better than
10% [37, 38]. Even in the presence of very high pT tracks, these tracks will likely dominate
the directions of the subjets and, according to Eq. (4), not contribute significantly to
N-subjettiness. Therefore, in this article we have ignored the experimental resolution, which
should not affect our results significantly unless we are interested in jets with extremely high
pT (> 1 TeV).

FIG. 8: The leading track's pT in jets with pT = 500 GeV.

Particle flow [39] is an interesting and useful experimental approach, in which one combines
the information from all subdetectors to reconstruct both charged and neutral particles.
Very briefly, charged particles, including muons, electrons and charged hadrons, are
first reconstructed and their energy removed from HCAL and ECAL clusters. The remaining
clusters are identified as neutral particles, including photons and neutral hadrons. Therefore,
particles reconstructed from the particle flow algorithm (which we call PF particles) are a
superset of the charged particles. If PF particles are used in jet reconstruction, the variables
discussed in this paper can be naturally defined without explicitly referring to whether
tracking or calorimeter information is used. For example, we can count the number of PF
particles instead of charged particles in our particle multiplicity definition. Since the PF
particles are a superset of the charged particles, one may obtain better results than those
quoted in this article. Of course, for jets with a higher pT, more particles are merged in
single calorimeter clusters and it remains to be studied whether significant improvement can
be achieved.

B. Other particles 

The same method can be used on other color singlet particles such as the Z boson and a
light Higgs boson. The major difference between these particles and the W boson is in their
masses. As we have discussed, the small mass difference between Z and W does not
significantly affect the charged multiplicity or N-subjettiness. Therefore, if we are only interested
in one of them and treat the other one as a background, a mass variable such as the filtered
mass is necessary. However, as we have discussed in the previous section, for extremely high
pT jets, one cannot accurately measure the mass of the color singlet particle, although it is
easy to distinguish a color singlet particle from a QCD jet.

Color singlet particles also differ in their spins, which affect the angular distributions 
of the decay products with respect to the moving direction of the decaying particle. For a 
transverse vector boson, one of the fermions from the decay tends to go along the direction of 
the vector boson, while the other one goes against it in the rest frame of the decaying particle. 
When boosted, this makes the momentum of one of the fermions much smaller than
that of the other. For a longitudinal vector boson, the two fermions tend to move perpendicular
to the vector boson's direction, which results in more balanced momentum configurations.
Therefore, the decay of a transverse vector boson is more like a QCD splitting, which makes
it harder to identify. The Higgs boson, being a spin-0 particle, has its decay products
evenly distributed in its rest frame; thus the distinguishing power for a spin-0 particle lies
between those for a transverse vector boson and a longitudinal one.
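
The link between the rest-frame decay angle and the momentum sharing in the boosted frame can be written down in one line. The relation below is a worked sketch under simplifying assumptions that are not stated in the text: the decay products are treated as massless, v denotes the boson velocity in units of c, and θ* is the rest-frame angle between one fermion and the boson's direction of motion.

```latex
% Assumptions of this sketch: massless decay products, boson velocity v (c = 1),
% rest-frame decay angle \theta^* measured from the boson's flight direction.
z \equiv \frac{E_1}{E_V} = \frac{1 + v\cos\theta^*}{2},
\qquad
1 - z = \frac{E_2}{E_V} = \frac{1 - v\cos\theta^*}{2}.
% Limits for v \to 1:
%   \cos\theta^* \to \pm 1  gives  z \to 1 or 0   (very asymmetric sharing),
%   \cos\theta^* = 0        gives  z = 1/2        (balanced sharing).
```

Decays along the flight direction therefore give one hard and one soft fermion, while decays perpendicular to it share the energy equally, which is the behaviour described above for transverse and longitudinal bosons respectively.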

Top jets are more complicated objects. On the one hand, compared with a W jet, it
is easier to distinguish a top jet from a QCD jet using kinematic information because it
contains three hard subjets. One may also utilize the presence of a b jet and other kinematic
variables such as the helicity angle [16] to improve the top tagging efficiency. On the other
hand, the top quark is a colored particle, and its radiation pattern is more like that of a QCD jet
than that of a boosted W, which makes it more difficult to use radiation variables to identify a
top jet. Nonetheless, a top jet contains a W jet among its decay products. One may try
to use the W tagging method or similar techniques to improve top tagging. This merits a
detailed study. Here we only note the difference in charged multiplicity between top jets
and QCD jets. Similar to Sec. 2, we examine top jets and QCD jets in the same fixed
momentum configuration: we consider e+e- → tt̄ and e+e- → qq̄gg events. Denoting the
beam direction as the z axis, we let the top move in the x direction and its decay products
all lie in the y-z plane in the top rest frame. The momenta of the two quarks from the W
decay are set to be of the same size. We let the qgg from the e+e- → qq̄gg process have
the same momentum configuration as the top decay. Then we count the numbers of charged
particles in the resulting top jets and QCD jets, which are shown in Fig. 9. We see a clear
distinction between top jets and 3-prong QCD jets. Note that, unlike the W jet case, the
charge is not conserved in a top jet no matter how large the jet size is. This is because the
top quark is color connected to the anti-top in the opposite hemisphere, and we expect soft
particles in between them, which can easily change the number of charged particles in the top
jet from even to odd or vice versa.

FIG. 9: Charged particle multiplicities for top jets and QCD jets with pT = 500 GeV and fixed
momentum configuration (see text) at an e+e- machine.



5. CONCLUSION 

In this article, we have demonstrated that hadronically decaying boosted massive particles
can be tagged using tracking information. Although one cannot reconstruct the mass of
the decaying particle using tracks alone, the distributions of these tracks are sensitive to jet
radiation patterns. In particular, the charged particle multiplicity from a color singlet particle
decay is boost invariant, which makes it an excellent discriminant between W/Z/Higgs
jets and QCD jets. Other jet shape variables can also be calculated using charged particles
alone, and they are complementary to variables calculated from calorimeter information.

We have also shown that jet substructure variables can be classified into those sensitive to
the jet hard splitting scale and those sensitive to the radiation pattern, which have small
correlations. If two or a few variables are used to distinguish massive particle jets from QCD
jets, it is most efficient to combine variables from the two categories. We have used W jet
tagging as an example and demonstrated that by combining the jet filtering algorithm with
charged particle multiplicity or N-subjettiness, we can improve the statistical significance
by a factor of ~ 1.6 over filtering alone. This approach simplifies the multivariate method
in Ref. [25] that utilizes as many as 25 variables, and may find applications at the LHC,
especially at its early stages.